 Viana do Castelo


Multi-Agent Multimodal Large Language Model Framework for Automated Interpretation of Fuel Efficiency Analytics in Public Transportation

Ma, Zhipeng, Bahja, Ali Rida, Burgdorf, Andreas, Pomp, André, Meisen, Tobias, Jørgensen, Bo Nørregaard, Ma, Zheng Grace

arXiv.org Artificial Intelligence

Enhancing fuel efficiency in public transportation requires the integration of complex multimodal data into interpretable, decision-relevant insights. However, traditional analytics and visualization methods often yield fragmented outputs that demand extensive human interpretation, limiting scalability and consistency. This study presents a multi-agent framework that leverages multimodal large language models (LLMs) to automate data narration and energy insight generation. The framework coordinates three specialized agents (a data narration agent, an LLM-as-a-judge agent, and an optional human-in-the-loop evaluator) to iteratively transform analytical artifacts into coherent, stakeholder-oriented reports. The system is validated through a real-world case study on public bus transportation in Northern Jutland, Denmark, where fuel efficiency data from 4006 trips are analyzed using Gaussian Mixture Model clustering. Comparative experiments across five state-of-the-art LLMs and three prompting paradigms identify GPT-4.1 mini with Chain-of-Thought prompting as the optimal configuration, achieving 97.3% narrative accuracy while balancing interpretability and computational cost. The findings demonstrate that multi-agent orchestration significantly enhances factual precision, coherence, and scalability in LLM-based reporting. The proposed framework establishes a replicable and domain-adaptive methodology for AI-driven narrative generation and decision support in energy informatics.


A Cost-Effective Thermal Imaging Safety Sensor for Industry 5.0 and Collaborative Robotics

Barros, Daniel, Fraga-Lamas, Paula, Fernandez-Carames, Tiago M., Lopes, Sergio Ivan

arXiv.org Artificial Intelligence

The Industry 5.0 paradigm focuses on industrial operator well-being and sustainable manufacturing practices, where humans play a central role, not only during the repetitive and collaborative tasks of the manufacturing process, but also in the management of the factory floor assets. Human factors, such as ergonomics, safety, and well-being, push the human-centric smart factory to efficiently adopt novel technologies while minimizing environmental and social impact. As operations on the factory floor increasingly rely on collaborative robots (CoBots) and flexible manufacturing systems, there is a growing demand for redundant safety mechanisms (i.e., automatic human detection in the proximity of machinery that is under operation). Enhancing process safety through human proximity detection protects against possible incidents or accidents involving the deployed industrial devices and machinery. This paper introduces the design and implementation of a cost-effective thermal imaging Safety Sensor that can be used in the scope of Industry 5.0 to trigger distinct safe mode states in manufacturing processes that rely on collaborative robotics. The proposed Safety Sensor uses a hybrid detection approach and has been evaluated under controlled environmental conditions. The obtained results show 97% accuracy at low computational cost when using the developed hybrid method to detect the presence of humans in thermal images.


Multi-Agent Based Simulation for Investigating Centralized Charging Strategies and their Impact on Electric Vehicle Home Charging Ecosystem

Christensen, Kristoffer, Jørgensen, Bo Nørregaard, Ma, Zheng Grace

arXiv.org Artificial Intelligence

This paper addresses the critical integration of electric vehicles (EVs) into the electricity grid, essential for achieving carbon neutrality by 2050. The rapid increase in EV adoption poses significant challenges to the existing grid infrastructure, particularly in managing the increasing electricity demand and mitigating the risk of grid overloads. Centralized EV charging strategies are investigated due to their potential to optimize grid stability and efficiency, compared to decentralized approaches that may exacerbate grid stress. Utilizing a multi-agent based simulation model, the study provides a realistic representation of the electric vehicle home charging ecosystem in a case study of Strib, Denmark. The findings show that the Earliest-deadline-first and Round Robin strategies perform best with 100% EV adoption in terms of EV user satisfaction. The simulation considers a realistic adoption curve, EV charging strategies, EV models, and driving patterns to capture the full ecosystem dynamics over a long-term period with high resolution (hourly). Additionally, the study offers detailed load profiles for future distribution grids, demonstrating how centralized charging strategies can efficiently manage grid loads and prevent overloads. Keywords: multi-agent based simulation, multi-agent systems, agent-based modeling, electric vehicle, charging strategies, charging algorithms.
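
The two centralized strategies named in the abstract above can be illustrated with a minimal sketch. It is not the paper's simulation model: the shared hourly capacity, the charger rating, the example fleet, and the names `EV` and `schedule` are all assumptions made for illustration, and the toy numbers do not reproduce the study's results. Each hour, a central controller hands out a fixed energy budget, either to the vehicle with the earliest departure deadline first or in a rotating order.

```python
from dataclasses import dataclass


@dataclass
class EV:
    name: str
    need_kwh: float   # energy still required (hypothetical value)
    deadline_h: int   # hour by which charging should be complete


def schedule(evs, strategy, capacity_kw=11.0, charger_kw=11.0, hours=8):
    """Return {name: hour at which charging finished, or None}.

    Assumes one-hour slots, so kW and kWh per slot are numerically equal.
    """
    remaining = {ev.name: ev.need_kwh for ev in evs}
    finished_at = {ev.name: None for ev in evs}
    for hour in range(hours):
        waiting = [ev for ev in evs if remaining[ev.name] > 0]
        if not waiting:
            break
        if strategy == "edf":
            # Earliest-deadline-first: most urgent vehicle is served first.
            order = sorted(waiting, key=lambda ev: ev.deadline_h)
        elif strategy == "round_robin":
            # Round Robin: rotate the starting vehicle every hour.
            shift = hour % len(waiting)
            order = waiting[shift:] + waiting[:shift]
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        budget = capacity_kw  # shared grid budget for this hour
        for ev in order:
            if budget <= 0:
                break
            delivered = min(charger_kw, remaining[ev.name], budget)
            remaining[ev.name] -= delivered
            budget -= delivered
            if remaining[ev.name] <= 0:
                finished_at[ev.name] = hour + 1
    return finished_at


# Hypothetical three-vehicle fleet sharing one 11 kW connection.
fleet = [EV("a", 22.0, 6), EV("b", 11.0, 1), EV("c", 11.0, 4)]
edf_result = schedule(fleet, "edf")
rr_result = schedule(fleet, "round_robin")
```

Comparing each finishing hour in `edf_result` and `rr_result` against the corresponding `deadline_h` gives a crude stand-in for the user-satisfaction measure mentioned in the abstract; the study itself evaluates the strategies over a long-term, hourly multi-agent simulation.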


ChatGPT as Co-Advisor in Scientific Initiation: Action Research with Project-Based Learning in Elementary Education

Villan, Fabiano, Santos, Renato P. dos

arXiv.org Artificial Intelligence

Background: In the contemporary educational landscape, technology has the power to drive innovative pedagogical practices. Overcoming the resistance of teachers and students to adopting new methods and technologies is a challenge that needs to be addressed. Objectives: To evaluate the effectiveness of ChatGPT as a co-advisor in research projects and its influence on the implementation of Project-Based Learning (PBL), as well as overcoming resistance to the use of new pedagogical methodologies. Design: An action-research methodology was employed, including unstructured interviews and the application of questionnaires via Google Forms. Setting and Participants: The research was conducted in an elementary school, involving 353 students and 16 teachers. Data Collection and Analysis: Data were gathered through observations and notes in meetings and interviews, complemented by electronic questionnaires, with quantitative and qualitative analyses performed via Microsoft Excel and Google Forms. Results: The introduction of ChatGPT as a pedagogical tool led to increased student engagement and decreased teacher resistance, reflected in recognition at local science fairs. Conclusion: The study confirmed the utility of ChatGPT in school research co-orientation, highlighting its role in facilitating PBL and promoting cultural changes in educational practice, with proactive school management identified as a catalysing element in adapting to educational innovations.


Human Body Pose Estimation for Gait Identification: A Comprehensive Survey of Datasets and Models

Topham, Luke K., Khan, Wasiq, Al-Jumeily, Dhiya, Hussain, Abir

arXiv.org Artificial Intelligence

Person identification is a problem that has received substantial attention, particularly in security domains. Gait recognition is one of the most convenient approaches, enabling person identification at a distance without the need for high-quality images. There are several review studies addressing person identification, such as those based on facial images, silhouette images, and wearable sensors. Although skeleton-based person identification is gaining popularity and overcomes the challenges of traditional approaches, existing survey studies lack a comprehensive review of skeleton-based approaches to gait identification. We present a detailed review of the human pose estimation and gait analysis that make the skeleton-based approaches possible. The study covers various types of related datasets, tools, methodologies, and evaluation metrics with associated challenges, limitations, and application domains. Detailed comparisons are presented for each of these aspects with recommendations for potential research and alternatives. A common trend throughout this paper is the positive impact that deep learning techniques are beginning to have on topics such as human pose estimation and gait identification. The survey outcomes might be useful for the related research community and other stakeholders in terms of performance analysis of existing methodologies, potential research gaps, application domains, and possible contributions in the future.


Psychophysiological Arousal in Young Children Who Stutter: An Interpretable AI Approach

Sharma, Harshit, Xiao, Yi, Tumanova, Victoria, Salekin, Asif

arXiv.org Artificial Intelligence

The presented first-of-its-kind study effectively identifies and visualizes the second-by-second pattern differences in the physiological arousal of preschool-age children who stutter (CWS) and children who do not stutter (CWNS) while speaking perceptually fluently in two challenging conditions, i.e., speaking in stressful situations and narration. The first condition may affect children's speech due to high arousal; the latter introduces linguistic, cognitive, and communicative demands on speakers. We collected physiological parameter data from 70 children in the two target conditions. First, we adopt a novel modality-wise multiple-instance-learning (MI-MIL) approach to classify CWS vs. CWNS in different conditions effectively. The evaluation of this classifier addresses four critical research questions that align with state-of-the-art speech science studies' interests. Later, we leverage SHAP classifier interpretations to visualize the salient, fine-grained, and temporal physiological parameters unique to CWS at the population/group level and the personalized level. While group-level identification of distinct patterns would enhance our understanding of stuttering etiology and development, the personalized-level identification would enable remote, continuous, and real-time assessment of stuttering children's physiological arousal, which may lead to personalized, just-in-time interventions, resulting in an improvement in speech fluency. The presented MI-MIL approach is novel, generalizable to different domains, and real-time executable. Finally, comprehensive evaluations on multiple datasets, the presented framework, and several baselines yielded notable insights into the physiological arousal of CWS during speech production.